Abstract

Social connectivity is the key process that characterizes the structural properties of social networks and in turn processes 
such as navigation, influence or information diffusion. Since time, attention and cognition are inelastic resources, humans 
should have a predefined strategy to manage their social interactions over time. However, the limited observational length of 
existing human interaction datasets, together with the bursty nature of dyadic communications, has hampered the observation 
of tie dynamics in social networks. Here we develop a method for the detection of tie activation/deactivation, and apply it to a 
large longitudinal, cross-sectional communication dataset (≈ 19 months, ≈ 20 million people). Contrary to the perception of 
ever-growing connectivity, we observe that individuals exhibit a finite communication capacity, which limits the number of ties 
they can maintain active. In particular we find that men have an overall higher communication capacity than women and that 
this capacity decreases gradually for both sexes over the lifespan of individuals (16-70 years). We are then able to separate 
communication capacity from communication activity, revealing a diverse range of tie activation patterns, from stable to 
exploratory. We find that, in simulation, individuals exhibiting exploratory strategies take longer to receive information 
spreading in the network than individuals with stable strategies. Our principled method to determine the communication 
capacity of an individual allows us to quantify how strategies for human interaction shape the dynamical evolution of social 
networks. 

Many different forces govern the evolution of social relationships,
making them far from random. In recent years, the study of the
mechanisms that control the dynamics of activating or deactivating
social ties has uncovered forces ranging from geography to structural
positions in the social network (e.g. preferential attachment, triadic
closure), to homophily [1]. These findings are pervasive in empirical
analyses across cultures, communication technologies and interaction
environments [2-11].

However, the incorrect assumption that time, attention and 
cognition are elastic resources has blurred the study of how 
individuals manage their social interactions over time [12-14]. 
Understanding such social strategies is not only of paramount 
importance to make progress in the characterization of human 
behavior, but also to improve our current description of social 
networks as evolutionary objects against the (aggregated) ever- 
growing or static pictures of the social structure. 

Several reasons have hampered the observation of tie ac- 
tivation/deactivation dynamics in social networks at large 
scale: on the one hand, studies of diffusion based on datasets 
from pre-electronic eras have safely assumed that tie activa- 
tion/deactivation is a much slower process than interactions 
within a tie, and thus their dynamics might be safely neglected 
[15-17]. However, the current ability to communicate faster 
and further than ever accelerates tie dynamics in an unprecedented 
manner, to the point that tie activation/deactivation may unfold on 
time scales comparable to those of processes like information spreading. On the 
other hand, available data about how ties form or decay were 
restricted to egocentric, small social networks and/or short pe- 
riods of time which made it difficult to assess the universality 
of the results obtained and their extension to other situations 
[?]. Finally, although in some online social networks there 
are explicit rules for the establishment of social ties, in most 
cases activity is the only way to assess the existence or not of 
the tie [18, 19]. Online social networks are plagued with this 
problem due to the cheap cost of maintaining "friends" which 
are in fact already deactivated relationships [20]. However, 
using activity as a proxy for tie presence is a problem in most 
communication channels like mobile phone calls, emails, electronic 
social networks, etc., since tie activity is very bursty 
[1] and so far there is no clear method to discriminate those 
social ties that are already inactive from large inter-event times 
within active relationships [42]. 

1. Detection of tie activation/deactivation 

To study the formation and decay of communication ties, we 
study the Call Detail Records (CDRs) from a single mobile 
phone operator over a period of 19 months. The data consists 
of the anonymized voice calls of about 20 million users that 
form 700 million communication ties. After filtering out all 
the incoming or outgoing calls that involve other operators, we 
only consider users that are active across the whole time period 
and retain only ties which are reciprocated. We refer to SI 
Sections A and H for further details about the processing and the 
sampling of the datasets and for the comparison with another 
(smaller) database of Facebook communication through wall 
posts. 

Figure 1. Detection of tie activation/deactivation: Schematic view of the time intervals considered in our database, the different 
situations of tie activation/deactivation, and the interplay between the tie communication patterns and tie activation/deactivation for a 
given observation time window $\Omega$ of length T = 7 months (shadowed area). Each line refers to a different tie, each vertical 
segment indicates a communication event between $i$ and $j$, and $\delta t_{ij}$ is the inter-event time in the $i \leftrightarrow j$ time series. 

In most studies of communication networks a tie is assumed 
to be present if it shows any activity in the observation 
window [22]. However, since communication is bursty [?], 
large inter-event times between interactions are likely and 
thus ties might go unobserved, or long gaps might be mistaken for tie decay or 
formation, especially if the observation window is short (see 
Fig. 1 and SI Section A). For example, in our call database we 
find that the average time between tie communication events 
is $\langle \delta t_{ij} \rangle = 14$ days (with $\sigma = 18$ days) and thus we might 
get spurious effects if the observation window is of the order 
of months, as repeated interactions may fall outside the 
observation window [23]. 

To overcome this we propose a different method to assess 
whether a tie has been activated/deactivated in the observation 
window $\Omega$. The method is based on the observation of tie 
activity in a time window before/after $\Omega$: if tie activity is 
observed in the 6 months before $\Omega$ then it is considered an old 
tie [cases (a) and (d) in Fig. 1]; on the other hand, if activity 
is observed in the 6 months after $\Omega$ we will assume that the 
tie persists [cases (b) and (d) in Fig. 1]. In any other case, we 
will consider that the tie is activated and/or deactivated in $\Omega$ 
[cases (a), (b) and (c) in Fig. 1]. Of course, it is possible that 
even if there is no communication before/after the observation 
window, the tie is still active after/before our database. 
This would require that the tie has an inter-event time $\delta t_{ij}$ 
longer than 7 months, i.e. case (e) in Fig. 1. However, in 
our database, only 3.5% of the links have such a long inter-event 
time, which supports the accuracy of our definition of 
tie activation/deactivation. See SI Section B for details on our 
discrimination method. 
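
The classification rule above can be sketched in a few lines of Python. This is a minimal sketch, not the authors' implementation: the function name `classify_tie`, the representation of call times as day numbers, and the window bounds used in the usage note are assumptions made for illustration. Case (e) is read here as a tie with activity both before and after the window but none inside it, following the description of the missed links in SI Section A.

```python
from typing import Optional, Sequence


def classify_tie(event_days: Sequence[float],
                 win_start: float,
                 win_end: float) -> Optional[str]:
    """Return the case label (a)-(e) of one tie for the window [win_start, win_end).

    event_days holds the times of all calls on the tie over the whole
    dataset, including the intervals before and after the window.
    """
    before = any(t < win_start for t in event_days)
    inside = any(win_start <= t < win_end for t in event_days)
    after = any(t >= win_end for t in event_days)

    if inside:
        if before and after:
            return "d"  # old tie that persists: no event in the window
        if before:
            return "a"  # old tie, deactivated inside the window
        if after:
            return "b"  # activated inside the window, persists afterwards
        return "c"      # activated and deactivated inside the window
    if before and after:
        return "e"      # active tie whose gap spans the whole window
    return None         # tie not seen in the window and no evidence it spans it
```

With a hypothetical window running from day 181 to day 393, `classify_tie([10, 200], 181, 393)` gives `"a"` and `classify_tie([200, 250], 181, 393)` gives `"c"`.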



2. Communication capacity and activity 

The procedure described above allows us to determine the tie 
activation and deactivation events for each individual along 
the observation period of 7 months (see Fig. 2). With those 
events, we build her instantaneous communication capacity 
$\kappa_i(t)$, defined as the number of active ties at any given 
instant $t$. In principle, $\kappa_i(t)$ is very different from $k_i(t)$, the 
aggregated number of revealed ties up to time $t$, which is 
usually what is taken as a proxy for social connectivity [23]. 
Because of the bursty nature of interactions, $k_i(t)$ has a 
fictitious nontrivial time dynamics at the beginning of the 
observation period which is typically ignored in observations 
(see SI Section B for its implications). However, if we aggregate 
the number of activated (deactivated) ties up to time $t$, 
denoted by $n_{\alpha,i}(t)$ [$n_{\omega,i}(t)$], we get that at the end of $\Omega$ we 
have $k_i(T) = \kappa_i(0) + n_{\alpha,i}(T)$. Thus $k_i(T)$ is a combination 
of the communication capacity and communication activity 
in $\Omega$. In our database we find a large heterogeneity in 
$n_{\alpha,i}(T)$ and $n_{\omega,i}(T)$ [see Fig. 3A]: while on average people 
activate/deactivate about 8 (reciprocated) ties in a period of 
7 months, 20% of users in our database activate/deactivate 
more than 15 ties in that period. Note that on average $n_{\alpha,i}(T)$ 
and $n_{\omega,i}(T)$ almost equal $k_i(T)/2$ (see Fig. 3A), which suggests 
that a large fraction of the revealed aggregated social 
connectivity $k_i(T)$ is given by newly activated or deactivated 
connections; a similar ratio of activation/deactivation is found 
in the Facebook database (see SI Section H). Thus, $k_i(T)$ usually 
overestimates the instantaneous human communication 
capacity of maintaining active social ties. 
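
The bookkeeping behind these quantities can be illustrated with a short Python sketch. It assumes that the activation and deactivation events of one individual are already available as lists of times and that the number of ties active at the start of the window is known; the function name and the dictionary keys are hypothetical. The aggregated connectivity is computed from the identity quoted above, capacity at the start plus activations so far.

```python
from bisect import bisect_right
from typing import Dict, Sequence


def capacity_and_connectivity(kappa0: int,
                              activation_times: Sequence[float],
                              deactivation_times: Sequence[float],
                              t: float) -> Dict[str, int]:
    """Evaluate n_alpha(t), n_omega(t), kappa(t) and k(t) for one individual."""
    acts = sorted(activation_times)
    deacts = sorted(deactivation_times)
    n_alpha = bisect_right(acts, t)    # ties activated up to time t
    n_omega = bisect_right(deacts, t)  # ties deactivated up to time t
    return {
        "n_alpha": n_alpha,
        "n_omega": n_omega,
        "kappa": kappa0 + n_alpha - n_omega,  # ties active at time t
        "k": kappa0 + n_alpha,                # ties revealed up to time t
    }
```

For an individual with 10 active ties at the start, 4 activations and 4 deactivations, the capacity at the end is again 10 while the aggregated connectivity is 14, which is the sense in which the aggregated connectivity overestimates the capacity.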

Figure 2. Communication capacity and evolution of activity: Panel (A) shows the communication events of a given individual 
in our database with all her neighbors in the observation window $\Omega$. For each tie id, a vertical line represents a call with the 
corresponding neighbor. Grey horizontal rectangles are drawn from the first to the last observed communication event in each tie, 
considering also events before and after $\Omega$. Panel (B) shows vertical up/down arrows for each tie activation/deactivation event 
detected within $\Omega$. Using those events, panel (C) shows the aggregated number of active ties as a function of time, $\kappa_i(0) + n_{\alpha,i}(t)$, and 
the aggregated number of deactivated ties $n_{\omega,i}(t)$. The dashed line is the apparent growth in the social connectivity $k_i(t)$ obtained by the 
cumulative number of observed activity in ties up to some time, while the red line is the number of active connections at a given instant. 

The imbalance between the number of activated and deactivated 
ties measures how communication capacity changes. 
At the end of the observation period the change is 
$\kappa_i(T) - \kappa_i(0) = n_{\alpha,i}(T) - n_{\omega,i}(T)$. Interestingly, we find that for most 
users in our database $n_{\alpha,i}(T) \simeq n_{\omega,i}(T)$ (see details 
in Fig. 3). This means that there is a conservation principle 
in social communication, where the number of deactivated 
ties equals the number of activated ties in our observation 
window $\Omega$, such that the total number of active ties remains 
almost constant after T = 7 months. This conservation of 
communication capacity not only happens at this particular 
time scale $T$ but also instantaneously: as seen in Fig. 2 for a 
particular user, and in SI Section D, we find that for around 
90% of the users tie activation/deactivation happens linearly 
in time, so that $n_{\alpha,i}(t) \simeq \alpha_i t$ and $n_{\omega,i}(t) \simeq \omega_i t$, where $\alpha_i$ and $\omega_i$ 
are the rates of tie activation/deactivation and $\alpha_i \simeq \omega_i$ (see 
Fig. 3C). These two facts have a remarkable consequence: 
although ties are activated/deactivated continually, the communication 
capacity of each individual remains almost constant 
throughout the observation period, $\kappa_i(t) \simeq \kappa_i$, signaling that 
people tend to balance the activation/deactivation of ties in 
such a way that the number of active relationships remains 
stable over time. The conservation of social capacity is the 
root of many observations in the literature (see for example 
[?]) that the distribution of connectivity in social networks 
seems to be stable in time while the neighbors of a given node 
change from one time window to another. Specifically, 
we find that the average user social persistence $p_i$, measured 
as the fraction of neighbors present at the beginning of the 
observation window $\Omega$ that remain active until its end, lies 
around 75%. This means that users renew their social circle 
slowly, in line with studies in off-line social networks [?]. 
This value is much larger than what is expected in a model 
where all ties have the same probability to be activated or 
deactivated, in which case we obtain $p_i$ = 50% (see SI Section F). 
Our results corroborate that the way in which people 
activate and deactivate ties in their social network is not 
random; instead, some existing ties are more likely to be 
deactivated than others. 
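
The persistence measure used in this paragraph is a simple set ratio, shown here as a small Python sketch. The function name and the use of neighbor identifiers as set elements are assumptions for illustration; the null-model value of 50% is not reproduced here, since its construction is given only in SI Section F.

```python
from typing import Hashable, Iterable


def persistence(initial_neighbors: Iterable[Hashable],
                final_neighbors: Iterable[Hashable]) -> float:
    """Fraction of the neighbors active at the start of the window
    that are still active at its end."""
    initial = set(initial_neighbors)
    if not initial:
        raise ValueError("persistence is undefined without initial neighbors")
    return len(initial & set(final_neighbors)) / len(initial)
```

An individual who starts with neighbors {a, b, c, d} and ends with {a, b, c, x} has a persistence of 0.75; the newly activated tie x does not enter the ratio.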

Thus, individual communication can be characterized in 
terms of the communication capacity $\kappa_i$ and the communication 
activity $n_{\alpha,i}$ (or rate $\alpha_i$) in a time window. These two 
quantities give information about two related, although not 
equivalent, features of social communication. While the capacity 
is a measure of the number of relations that a user manages 
instantaneously, the activity is instead related to the number 
of relations a user establishes and at what rate. However, as 
shown in Fig. 4, we observe for a large part of the individuals 
that $n_{\alpha,i} \simeq \beta \kappa_i$ with $\beta = 0.75$, meaning that the number of 
created connections tends to be proportional to the communication 
capacity. This correlation resembles the preferential 
attachment process by which tie activation is more probable 
for more connected individuals. Note however that we find 
that tie activation is here proportional to a conserved quantity 
and thus grows linearly in time for $t \gg 1$; and on top of that, 
there is a corresponding preferential de-attachment mechanism, 
meaning that individuals with large $\kappa_i$ are also more 
likely to deactivate ties. Although the dependence $n_{\alpha,i} \simeq \beta \kappa_i$ 
explains most of the observed behavior (80% of variance in 
PCA), there is still a large variability in our database so that 
tie evolution cannot be explained solely by $\kappa_i$. As shown in 
Fig. 3, for a given number of people contacted in the observation 
period $k_i(T)$ there are many possible combinations of 
social activity $n_{\alpha,i}$ and capacity $\kappa_i$ which yield the same 
$k_i(T)$. 

Figure 3. Characterization of communication capacity and activity: (A) Probability distribution function (pdf) of the aggregated 
social connectivity $k_i$, number of created ties $n_{\alpha,i}$ and number of deleted ties $n_{\omega,i}$ at t = T, compared with the pdf for the average 
communication capacity $\kappa_i$ over the observation window. (B) Relationship between the number of formed $n_{\alpha,i}$ and decayed $n_{\omega,i}$ 
ties in the observation window for the users in our database: the results from the PCA indicate that 93% of the variation can be 
explained by the first component in the (0.70, 0.71) direction, i.e. almost the black line $n_{\alpha,i} = n_{\omega,i}$ in the plot. Furthermore, the box 
plot shows the 25% and 75% percentiles (filled box) and 5% and 95% percentiles (whiskers), and the blue curves correspond to the 
5% and 95% percentiles of the corresponding Poisson null model for our data (see SI Section E). (C) Density plot $\rho(\log \omega_i, \log \alpha_i)$ 
for users with more than 5 ties formed and decayed. The dashed line is the $\alpha_i = \omega_i$ relationship and the curves correspond to the contour 
lines $\rho = 0.01$ for the density of actual values of rates (red) and the ones obtained in the Poissonian null model (blue, see SI Section 
E for further information). 

2.1 Lifetime evolution and sex differences 

Although the communication capacity and activity remain 
mostly stable over the observation time window $\Omega$, they tend 
to change gradually during the individual life course. Specifically, 
as shown in Fig. 4, we observe that as people get older 
the size of their social circle ($k_i = n_{\alpha,i} + \kappa_i$) decreases. This 
decrease in both the communication capacity and activity 
observed in Fig. 5 is in line with previous studies on the lifetime 
evolution of the cognitive and communication capacity 
of individuals [25-27]. Specifically, changes in egocentric 
network size across the individual lifespan are usually associated 
with both age-specific life events and social 
goals [28]. Other studies relate the decrease in social 
engagement (number of social contacts, interaction activity, 
frequency of communication) across the individual lifespan 
to a decrease in cognitive capacity [27]. Our decomposition 
of $k_i$ as a combination of $n_{\alpha,i}$ and $\kappa_i$ allows us to better 
understand the change in social network size across the individual 
lifespan and its relation with individual communication 
strategies. 

Although the trend in vital trajectories does not change significantly 
with the gender of the individual, interesting differences 
are observed between the social strategies of men and women 
(Fig. 4). First, in line with recent studies using mobile phone 
records [31, 43], we found that on average women maintain 
smaller social circles than men, which seems to happen regardless 
of their age. Interestingly, communication activity 
and capacity change gradually over the lifetime of men, 
with no significant drop before the 60s. On the other hand, 
women show a clearly marked difference between adolescence 
(< 16 years) and the rest of their lifetime. 

3. Social strategy 

As we show in Fig. 5, there are many different combinations 
of communication capacity $\kappa_i$ and activity $n_{\alpha,i}$ which yield 
the same aggregated connectivity $k_i$ in the 
observation window. We encode that disparity in the ratio 
$\gamma_i = n_{\alpha,i}/\kappa_i$, which we dub social strategy and which gives 
information about the balance between the communication 
capacity and the communication activity for a given node: for 
$\gamma_i \simeq \beta$ (the average behavior), users have a normal or balanced 
social strategy between their communication capacity and activity. 
Outside this group we find those users with $\gamma_i \ll \beta$ that 
activate/deactivate a small number of connections compared 
to their communication capacity, or users with $\gamma_i \gg \beta$ who 
have a large communication activity compared to their communication 
capacity. We refer to these two strategies as social 
keeping ($\gamma_i \ll \beta$), meaning that these individuals keep a very 
stable social circle, and social exploring ($\gamma_i \gg \beta$), meaning 
that these individuals activate new ties and deactivate existing 
ones at a high pace. 
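
As an illustration, the social strategy ratio and a coarse labeling of individuals can be written as follows. This is a sketch under stated assumptions: the function names and the label strings are hypothetical, and the cut-offs 0.2 (keepers) and 2 (explorers) are the ones used for the comparisons in Section 3.1, not a definition given in this section; everything in between is simply labeled "balanced" here.

```python
def social_strategy(n_alpha: float, kappa: float) -> float:
    """Ratio gamma = n_alpha / kappa of activity to capacity."""
    if kappa <= 0:
        raise ValueError("capacity must be positive")
    return n_alpha / kappa


def label_strategy(gamma: float,
                   keeper_max: float = 0.2,
                   explorer_min: float = 2.0) -> str:
    """Coarse label for a strategy value (thresholds are assumptions)."""
    if gamma < keeper_max:
        return "keeper"    # few activations relative to capacity
    if gamma > explorer_min:
        return "explorer"  # many activations relative to capacity
    return "balanced"
```

An individual with 8 active ties who activates 6 new ones in the window has a ratio of 0.75, the average value reported above, and falls in the balanced group.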

In the following we study how such different social strategies 
relate to topological properties and impact the local and 
global network dynamics, as they operate on the time scales 
relevant for viral information diffusion. 

Figure 4. Variability of communication capacity and activity: (A) and (B) show different snapshots of the neighborhood of two 
different individuals (in red) at 4 equally spaced times in the observation time window, t = 52, 105, 158, and 211 days. Each black 
(grey) line corresponds to an active (inactive) tie at that particular instant. (C) Log-density plot of the communication activity $n_{\alpha,i}$ as 
a function of the communication capacity $\kappa_i$ for each individual in our database. The solid line corresponds to the line $n_{\alpha,i} = 0.75\,\kappa_i$ 
obtained through PCA. Dashed curves are the iso-connectivity lines $k_i = \kappa_i + n_{\alpha,i}$ for $k_i$ = 10, 20, 50. 

3.1 Relation to topological properties 

We find a significant correspondence between social strategy 
and individuals' local network topology. As mentioned above, 
users show on average a 75% persistence in their ties in 7 
months, where the persistence is measured as the fraction 
of initial ties that remain active during the whole $\Omega$ (see SI 
Section F). However, as shown in SI Section F, this value 
rises up to 90% for social keepers with $\gamma_i < 0.2$ and is only 
52% for social explorers with $\gamma_i > 2$. A similar dependency is 
found for the (aggregated) clustering coefficient $c_i$: as shown 
in SI Section F, for a fixed $k_i$ the clustering coefficient 
of social keepers doubles that of social explorers, meaning 
that for equal $k_i$ the former have fewer distinct social contexts 
or less structural diversity [34] than the latter. Finally, we find 
that along with the assortativity of $k_i$ in the social network 
we get a large assortativity of social strategies, with a Pearson 
coefficient $\rho(\gamma_i, \gamma_{nn,i}) \simeq 0.3$ (see SI Section F for further details). 
This means that social explorers/keepers tend to gather. 
These findings render a dynamical picture of the network with 
very different evolution rhythms: highly clustered and almost 
static areas of social keepers live together with extremely 
volatile groups of social explorers. 

Our analysis of the Facebook communication dataset shows 
that these patterns also hold for users interacting online (see 
SI Section H). 

3.2 Information diffusion 

Finally we investigate whether social strategies have an impact 
on an individual's capacity to access information being 
propagated in a network. To address this, we have run the 
Susceptible-Infected model on the real sequence of CDRs. In 
a way analogous to previous works [32-35], we start the simulation 
by infecting a random node at a random time instant 
and considering all other nodes as susceptible. At each call, 
if one of the two nodes involved is infected, the susceptible one 
becomes infected too. This maximal spreading process generates 
a viral cascade which continues until all reachable nodes are 
in the infected state. We repeat the simulation for $10^4$ randomly 
chosen seeds. For each individual we then measure 
the infection time $t_{\mathrm{inf}}$ as the time difference between the time 
at which she received the information and the time at which 
the corresponding cascade was initiated. Obviously, for a 
given individual, the infection time decreases with her total 
connectivity $k_i$ and her total number of communication events 
$w_i$: the more connections an individual has and the more she 
interacts, the sooner she receives the information. But when 
we control for $k_i$ and $w_i$ we observe that on average there is a 
dependence between how stable the social strategy is and the 
infection time (Fig. 6). Interestingly, we observe that social 
explorers ($\gamma_i > 2$) take relatively longer (a difference of roughly 2-3 
days) to become aware of the information compared 
to social keepers ($\gamma_i < 0.2$). 
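
A single run of this spreading process can be sketched in Python as follows. The sketch assumes the calls are available as `(time, u, v)` tuples and treats calls with identical timestamps sequentially, which is a simplification; the function name and data layout are hypothetical, and the repetition over many random seeds and start times described above is left to the caller.

```python
from typing import Dict, Hashable, Iterable, Tuple

Call = Tuple[float, Hashable, Hashable]


def si_infection_times(calls: Iterable[Call],
                       seed: Hashable,
                       t0: float) -> Dict[Hashable, float]:
    """Run one Susceptible-Infected cascade on a time-ordered call sequence.

    Returns, for every node reached, the time elapsed between the start
    of the cascade (t0) and the call at which the node became infected.
    """
    infected_at = {seed: t0}
    for t, u, v in sorted(calls, key=lambda call: call[0]):
        if t < t0:
            continue  # calls before the cascade starts cannot transmit
        u_infected = u in infected_at
        v_infected = v in infected_at
        if u_infected and not v_infected:
            infected_at[v] = t  # transmission happens at every contact
        elif v_infected and not u_infected:
            infected_at[u] = t
    return {node: t - t0 for node, t in infected_at.items()}
```

Averaging the returned times per node over many seeds gives an estimate of the infection time that can then be compared across strategy groups at fixed connectivity and number of events.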

We observe that only some combinations of node strength 
and social strategy are possible. With low to moderate levels 
of exploration in social strategies ($\gamma_i < \beta$) it is possible 
to reach a wide range of node strengths, with a sweet spot 
in connectivity that allows individuals to lower their time to 
access information. However, with $\gamma_i > \beta$ the number of nodes 
with high strength decreases exponentially: highly exploratory 
individuals display a very low level of communication events 
and therefore take a very long time to receive information circulating 
in the network. This result suggests that the information 
access benefits of diverse ties are outweighed by their short 
lifespan, resulting in a net delay in access to information 
for the individuals activating them. 

4. Discussion 

Our insights can be seen, in essence, as the individual-level 
dynamical version of the tie-level static results reported by 
Onnela et al. [22]. The authors analyzed 18 weeks of mobile 
phone call records from 7 million people and showed that, in 
terms of information diffusion, ties with low cumulative communication 
time (strength in our context) are ineffective at 
information transfer. Our results clarify that these ties are disproportionally 
generated by social explorers, and that they are 
mostly activated and deactivated in a short time span. In fact, 
we find that the average tie weight of each individual $\bar{w}_i$ (measured 
in terms of average number of exchanged calls per tie) 
is negatively correlated with the social strategy $\gamma_i$, with a Pearson 
coefficient $\rho(\log \gamma_i, \log \bar{w}_i) \simeq -0.32 \pm 0.01$, indicating that 
on average weak ties belong mostly to social explorers. Note 
that these highly time-localized communications differ from 
the conventional wisdom about weak ties. Typically, in fact, 
weak ties are seen as bridging connections that span remote 
parts of the network permanently, since they are considered active 
over the whole observation period [36-39]. In our dataset, 
instead, this happens with low frequency. Although a detailed 
analysis of what constitutes a weak tie is beyond our scope, 
we find that of all ties with less than 10 calls (corresponding 
to 50% of the whole population of ties), only about 20% 
remain active during the entire observation window. This 
is also consistent with the "Diversity-Bandwidth Tradeoffs" 
observed in corporate email communication datasets from two 
medium sized firms (107 people over 10 months; 214 over 12 
months). The authors found empirical evidence that people 
who form ties to disparate parts of the social network at the 
cost of reducing their bandwidth of communication can have 
disadvantaged access to the novelty they receive [40, 41]. Our 
simulation results support this finding for a large scale social 
network and connect it to measurable individual strategies. 

Figure 5. Sociodemographic dependence of the capacity and activity: (A) Average value of the social capacity $\kappa_i$ and the 
activity $n_{\alpha,i}$ for groups of users with different age and gender. Dashed lines correspond to the average of $\kappa_i$ and $n_{\alpha,i}$ in the complete 
database and the solid line is the line $n_{\alpha,i} = \beta \kappa_i$ obtained through the PCA in the complete database. (B) Average values of the 
activity and capacity of users grouped by gender. 

Although, as we have seen, the adoption of social strategies 
does not seem to depend on the magnitude of activity and 
capacity, we have found them to be assortative. In addition, 
although we cannot establish causality with our methodology 
and observational period, it is an interesting question whether 
social strategies can be behind the homophily in static topological 
properties, which has been observed in a wide range 
of real social networks [35]. 

These findings document an important contrast between 
possible social dynamics: for almost any given $k_i$ we can 
find social explorers with that connectivity that navigate the 
network for new ties and thus have larger structural diversity, 
as well as social keepers, more conservative individuals who 
focus their attention on their stable social neighborhood. In other 
words, individuals can exhibit exploratory or stable strategies 
at multiple scales of connectivity, and these strategies have 
a more important impact on the resulting network properties, 
ranging from cohesiveness to information diffusion, than the 
total number of contacts they are able to initiate or receive. 
This result is important as it provides conclusive evidence 
for the divergence between the static and dynamic characterizations 
of human interaction. Fine-grained, longitudinal and 
cross-sectional data such as those presented in this study are then 
needed to fully understand processes such as navigation, influence 
and information diffusion as they happen concurrently with, 
and possibly entangled with, the unfolding of social strategies in 
time. 

A. Preparing and Sampling the Data 

The data used in this study have been obtained from the Call 
Detail Records database of a single mobile phone operator 
in a single country. We focused exclusively on voice call 
records, filtering out short text messages, multimedia messages 
and operator calls. Each subscription is anonymized 
such that it is not possible to recover personal information of 
the users. We filtered out all the incoming or outgoing calls 
that involve other operators due to the partial access we have 
to the activity of other providers. To avoid business-like subscriptions, 
which usually appear as users with a huge number 
of connections and calls never returned, we only retain ties 
which are reciprocated, which leads to the removal of about 
50% of the total links in our database. This restriction also 
eliminates calls to wrong numbers, telemarketing-type calls, 
customer service lines, etc. Within this approach, we neglect 
the directionality of links and consider a call from $i$ to $j$ equivalent 
to a call from $j$ to $i$ [22]. The resulting mobile graph 
contains the communication of about $20 \times 10^6$ users over 
a period of 19 months from February 2009 to August 2010. 

To disentangle the dynamics of tie creation/removal from 
their call activity, we split the 19-month period into 3 subintervals 
(Feb09 - Jul09, Aug09 - Feb10, Mar10 - Aug10) (see 
Main text Fig. 7). We have only considered the evolution 
of the ties and nodes that show any activity in the 7-month 
observation window $\Omega$. The resulting graph in $\Omega$ contains 
$16 \times 10^6$ individuals and $130 \times 10^6$ ties. The intervals before 
and after are used to assess respectively whether the ties exist 
from before and/or persist after $\Omega$. Fig. 1 shows the different 
situations that can occur for a given tie. In particular, in our 
database, 12.5% of links belong to category (a), 
14.5% to (b), 22.2% to (c) and 47.3% to (d), while 
only 3.5% of the links, which belong to category (e), will 
be missed in our analysis. 

Since we are interested only in tie dynamics between 
individuals, we have to take into account the problem of subscription 
and churn of users in our database. For example, 
the subscription of a new user and their communication with other 
users in our database results in the formation of many new ties 
for the new subscriber. The same would happen for the decay 
of ties of a subscriber that churns from the company. To mitigate 
this problem, we only keep active users in our data set: in 
particular, we only consider those users who are involved (as 
calling or as called party) in at least one communication event 
in each of the three subintervals in the 19 months, and who 
are also present in the database at least one month before 
$\Omega$ and are still active one month after $\Omega$. This latter filter 
prevents spurious effects in the analysis of tie dynamics that arise just 
because individuals subscribe/unsubscribe just before/after 
$\Omega$; for example, we could have observed an apparent rapid 
growth of their social network at the beginning of the observation 
window or a fast dissolution at its end [?]. This results 
in the removal of about 17% of nodes and 37% of 
reciprocated links within $\Omega$. 

In our database, we also have information on the age and 
gender of a randomly chosen fraction (40%) of the users. 
We found that the minimum and maximum values of age are 
respectively and 97. However, we only keep users whose 
age lies between 16 and 70 years old in order to yield a more 
reliable dataset. This filtering led to the removal of 0.5% 
of the users with demographic data. 

B. Entanglement between bursty activity and 
tie dynamics 

As stressed in the main text, one of the most challenging problems 
in the study of the dynamics of tie creation and removal 
is to identify whether a tie is actually a new/old connection. 
Although in most social networks there are specific events 
for the formation of new "friends" (or followers) or the corresponding 
"unfriending" events, due to the cheap cost of 
maintaining those connections most of those ties are abandoned 
and thus activity between individuals is the only way to assess 
the existence or not of that relationship. 

However, human activity is bursty, meaning that there are 
long periods of inactivity followed by bursts of activity [?]. 
This means that within a particular tie $i \leftrightarrow j$ the time between 
consecutive communication events $\delta t_{ij}$ is heavy-tailed distributed. 
In our database we find that this is indeed the case 
and, in line with [32, 33], we find that there is a universal law 
for the distribution of inter-event times (see Fig. 7). In particular, 
we find that for a particular tie $P(\delta t_{ij}) = \mathcal{F}(\delta t_{ij} / \overline{\delta t_{ij}})$, 
where $\overline{\delta t_{ij}}$ is the average inter-event time of the tie and 
$\mathcal{F}(x)$ is a heavy-tailed universal function. Since bursty 
behavior seems to be universal in human activity [?], it has a 
deep impact on the understanding of tie dynamics and translates 
into a ubiquitous problem in the empirical observation of social 
networks: if the observation window is very short we might 
miss most of the ties since there is no communication in that 
period of time. But on the other hand, since the inter-event 
time distribution is heavy-tailed we might have to go to large 
observation windows to recover most of the ties. For example, 
in our database we find that the average inter-event time is 
$\langle \delta t_{ij} \rangle = 14$ days with a standard deviation $\sigma = 18$ days, which 
means that the observation period must be longer than 2-3 
months only to observe (at least once) most of the ties in the 
social network. In our case, $\Omega$ extends over 7 months and 
using the previous/next 6-month intervals we calculate that 
3% of ties did not show activity in $\Omega$ and thus could have been 
missed if only data within $\Omega$ were present. 

Although the impact of burstiness on the observation of
ties is important, it becomes critical for the problem of tie
formation/decay, since it is necessary not only to observe
the tie but also to assess its formation or termination. Thus, we
need to increase the observation window substantially to identify
whether a link has been formed and/or has decayed in our
database. Short observation windows can lead to spurious
effects: a tie that is present in one time window might (with
large probability) show no activity in the next time window
due to a long inter-event time, and thus we might incorrectly
identify that event as the decay of the relationship. This might be
the origin of the large (30-40%) decay in persistence observed
in the literature (and reproduced in our database, see
Fig. 7(b)), since the observation windows were very short (1
month). The large probability of having an inter-event time of
one month in human communication leads to the erroneous
impression that 40% of the links are created/decayed in a
one-month period and that the networks are highly volatile, since
the correlation between the network structure at different
observation windows is very low.

To cure those problems, in our paper we propose a different
method to assess whether a tie formed/decayed in the observation
window $\Omega$. The method is based on the observation
of tie activity in a time window before/after $\Omega$: if tie activity
is observed in the 6 months before $\Omega$, then it is considered an
old tie [cases (a) and (d) in Main Text Fig. 1]; on the other
hand, if activity is observed in the 6 months after $\Omega$, we
assume that the tie persists [cases (b) and (d) in Main Text Fig. 1].
In any other case, we consider that the tie is formed
and/or decays in $\Omega$ [cases (a), (b) and (c) in Main Text Fig. 1].
Of course, it is possible that even if there is no communication
before/after the observation window, the tie is still active
before/after the period covered by our database. This would require
that the tie has an inter-event time $\delta t_{ij}$ longer than 7 months,
i.e. case (e) in Main Text Fig. 1. However, in our database only 3.5% of the
links have such a long inter-event time, which supports the
accuracy of our definition of tie formation/decay.
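
As an illustration only, the classification rule described above can be sketched in a few lines of Python. The names `classify_tie` and `TieStatus`, the use of timestamps measured in days, and the 180-day margin standing in for the 6-month windows are assumptions made for this sketch; they are not taken from the analysis code behind the paper.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class TieStatus:
    formed_in_window: bool   # True if no activity was seen in the margin before the window
    decayed_in_window: bool  # True if no activity was seen in the margin after the window


def classify_tie(event_days: Iterable[float],
                 window_start: float,
                 window_end: float,
                 margin: float = 180.0) -> Optional[TieStatus]:
    """Classify one tie from the timestamps (in days) of its communication events.

    Returns None when the tie shows no activity inside the window, since such
    a tie is not observed at all in the window.
    """
    times = list(event_days)
    if not any(window_start <= t < window_end for t in times):
        return None
    # Activity in the margin before the window marks the tie as old.
    seen_before = any(window_start - margin <= t < window_start for t in times)
    # Activity in the margin after the window marks the tie as persisting.
    seen_after = any(window_end <= t < window_end + margin for t in times)
    return TieStatus(formed_in_window=not seen_before,
                     decayed_in_window=not seen_after)
```

For example, with a hypothetical window spanning days 200 to 410, a tie with events on days 100, 250, 300 and 500 is both old and persisting, while a tie with events only on days 250 and 300 is counted as formed and decayed within the window.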

On the other hand, in our study a tie is considered to be
open between its formation and decay events (if they happen
in $\Omega$ at all). This assumption is based on the idea that an
interaction which has been observed in the past and will be
observed in the future can be assumed to exist at a given instant, even if
there is no mobile phone communication between the two individuals at that
particular instant. Furthermore, our observation window is
short enough to safely neglect possible repeated formations and decays
of the relationship within $\Omega$. Our definition of relationship
mitigates the excessive volatility that the social network shows when
a tie is considered to exist only when interaction is observed at a given
instant. For example, the persistence of open links is higher
(70% in one year) than that of observed links (40% in one year),
in line with off-line studies [2]. It also reflects the different
situations in which, although a strong relationship might exist
off-line, very few calls are exchanged over time.



Finally, understanding this difference between open and
observed relationships is crucial to unveil the real dynamics
of social networks, because it can also induce spurious effects
in the observations: within a given observation window $\Omega$, the
(revealed) aggregated connectivity $k_i(t)$ seems to grow
non-trivially as a function of time (see Main Text Fig. 2).
Actually, it could even be fitted to a power law $k_i(t) \sim t^{\gamma}$
with $\gamma \simeq 1/2$ for small $t$. It is interesting to see that the
functional form and exponent do fit those found in models
of network growth [7]. But it is easy to see that this effect
is (mostly) due to the fact that different links have
very heterogeneous numbers of communication events $w_{ij}$ and that,
within a given tie, events are very bursty. Specifically, the
apparent growth of $k_i(t)$ for short times is mainly due to the
possibly large and highly heterogeneous time to the first
event within ties.

To understand that, suppose that a given tie is present
before and after the observation window $\Omega$ and that the
distribution of inter-event times within that tie is given by $P(\delta t_{ij})$.
Assuming that the initial time of the observation window is
random, the time $\tau_{ij}$ to the first observation of the link is given
by the waiting time equation in renewal processes

$$P(\tau_{ij}) = \frac{1}{\overline{\delta t}_{ij}} \int_{\tau_{ij}}^{\infty} P(\delta t_{ij})\, d\delta t_{ij} \tag{1}$$

Thus, depending on the properties of $P(\delta t_{ij})$ we could have
a very large observation time $\langle \tau_{ij} \rangle$ for the link. As shown
in Fig. 7(a), the pdf for inter-event times depends mostly on
the average inter-event time $\overline{\delta t}_{ij}$, i.e.
$P(\delta t_{ij}) = \mathcal{P}(\delta t_{ij}/\overline{\delta t}_{ij})$,
where $\mathcal{P}(x)$ is a universal function. Thus, for a given
$\overline{\delta t}_{ij}$ we could rewrite the previous expression as

$$P(\tau_{ij} \mid \overline{\delta t}_{ij}) = \frac{1}{\overline{\delta t}_{ij}} \int_{\tau_{ij}}^{\infty} \mathcal{P}(\delta t_{ij}/\overline{\delta t}_{ij})\, d\delta t_{ij} \tag{2}$$



However, ties are very heterogeneous in the sense that
they have very different $\overline{\delta t}_{ij}$ or, equivalently, very
different weights $w_{ij} = T/\overline{\delta t}_{ij}$. Suppose that
$\Pi(\overline{\delta t}_{ij})$ is the distribution of average inter-event
times across links and that each user draws her tie activities from that
distribution. We also assume that no tie is formed or destroyed during the
observation time. Then the probability of first observing one of her links at
time $\tau$ is given by

$$P(\tau) = \int d\overline{\delta t}_{ij}\, \Pi(\overline{\delta t}_{ij})\, P(\tau \mid \overline{\delta t}_{ij}) \tag{3}$$

Thus, the growth of the observed connectivity as a
function of time is given by the cumulative distribution function (cdf) of $P(\tau)$:

$$k_i(t) = k_i(T) \int_0^t P(\tau)\, d\tau \tag{4}$$



where $k_i(T)$ is the total connectivity of node $i$ in the observation
window $\Omega$. Note that since $\mathcal{P}(x)$ and $\Pi(\overline{\delta t}_{ij})$
are heavy tailed, $P(\tau)$ is heavy tailed too, and thus $k_i(t)$ can
show an apparent non-trivial time dependence even if all links
are open during $\Omega$. Expression (4) shows that one should be
careful when considering the observed aggregate connectivity $k_i(t)$
as a proxy for social connectivity at any time $t$, since it is
profoundly affected by the bursty and heterogeneous activity
of human behavior encoded in $P(\tau)$, not to mention the
effect of tie formation/destruction, which is not included in
this expression.

Figure 7. (a) Rescaled inter-event time distribution for groups of edges with different average inter-event time $\overline{\delta t}_{ij}$. Each curve is
rescaled by the value of $\overline{\delta t}_{ij}$ of the corresponding bin. (b) Weekly persistence $p(n)$ of ties observed in the first week of our database
as a function of the number of weeks $n$: while persistence drops to 70% after one month if ties are required to have activity at a given
week $n$, it is still around 70% after one year if we consider open ties at that week, i.e. ties which were observed in the first week.

Strikingly, an apparent $k_i(t) \sim t^{\gamma}$ growth can be observed
even in the case in which both tie activity and weights are
severely bounded: if we assume that the distribution of
inter-event times is given by the exponential pdf
$P(\delta t \mid \overline{\delta t}) = e^{-\delta t/\overline{\delta t}}/\overline{\delta t}$
and also that the pdf for the average inter-event time is an
exponential $\Pi(\overline{\delta t}) = e^{-\overline{\delta t}/a}/a$, we get exactly from Eq. (4) that

$$k_i(t) = k_i(T)\left[1 - 2\sqrt{t/a}\; K_1\!\left(2\sqrt{t/a}\right)\right], \qquad 0 < t < T \tag{5}$$

where $K_1(x)$ is the modified Bessel function of the second
kind [5]. Thus, for a single user the number of observed
ties grows in a non-trivial way as a function of time even
in this homogeneous case (both in the events and in the link
properties), a behavior which extends well beyond $t = a$,
the average $\overline{\delta t}$ (see Fig. 8). This result for a single user,
based on the universal bursty and heterogeneous activity in
ties, together with the large heterogeneity found in social
connectivity (which is related to $k_i(T)$), could explain the
apparent non-trivial growth of the aggregate $k_i(t)$ observed in
social networks, and highlights the importance of taking
into consideration the heterogeneity of human activity in order to
properly define the way we measure and observe social
networks. Finally, these results emphasize the value of our
method to detect open ties, since in this simple example all
ties are open at any time and then $\kappa_i(t)$ is constant throughout
the observation window $\Omega$.

Figure 8. Apparent growth in the connectivity given by
Eq. (5) as a function of time for an exponential distribution
of average inter-event times with $a = 10$ days (marked
by the arrow). The dashed line is a power-law fit
to the initial growth (up to 5 days), which yields $k_i(t) \sim t^{\gamma}$
with $\gamma = 0.53 \pm 0.02$.

Figure 9. (Rescaled) Distribution of the time gap between edge creation (a) and edge removal (b) for groups of nodes with different
activity rate $\alpha_i$, where groups have been obtained according to the quartiles of $\alpha_i$ for the whole population.
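
The exponential example of Section B can be checked numerically. The following is a minimal sketch, assuming exponential inter-event times within each tie and exponentially distributed average inter-event times with scale `a`, as in the text; the function names, the quadrature settings and the sample sizes are choices made for this illustration only. It evaluates the integral of Eq. (4) with a midpoint rule and compares it with a direct Monte Carlo simulation of the time to the first observed event.

```python
import math
import random


def observed_fraction_quad(t, a, u_max=40.0, steps=40_000):
    """Fraction k_i(t)/k_i(T) of ties seen at least once by time t.

    For exponential ties, Eq. (4) reduces to
        F(t) = 1 - int_0^inf exp(-u - (t/a)/u) du,   with u = mean/a,
    which is evaluated here with a midpoint rule truncated at u_max.
    """
    c = t / a
    h = u_max / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * h
        total += math.exp(-u - c / u)
    return 1.0 - h * total


def observed_fraction_mc(t, a, n_ties=50_000, seed=0):
    """Monte Carlo estimate of the same fraction.

    Each tie draws its average inter-event time from an exponential with
    mean a. For an exponential renewal process the waiting time to the
    first event is again exponential with the same mean.
    """
    rng = random.Random(seed)
    seen = 0
    for _ in range(n_ties):
        mean_dt = rng.expovariate(1.0 / a)
        if mean_dt == 0.0:
            seen += 1  # degenerate draw: events are infinitely frequent
            continue
        tau = rng.expovariate(1.0 / mean_dt)
        if tau <= t:
            seen += 1
    return seen / n_ties
```

With `a = 10` days, both estimates indicate that a sizeable fraction of ties remains unobserved at `t = a`, even though every tie is open throughout the window in this toy setting.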



C. Bursty dynamics of tie activation or deactivation

We observe that most people form and destroy edges almost
constantly in time (see Main Text Fig. 5). However, despite
the linear growth of the number of added and removed
connections, the distribution of the time gap between consecutive
creations/removals of ties is not Poissonian (Fig. 9), which is in
line with recent results [3]. Fig. 9 (a) and (b) show respectively
the pdf of the time it takes for node $i$ with degree
$k_i$ to create one more connection ($\delta t_{k \to k+1}$) and to lose one
connection ($\delta t_{k \to k-1}$). Specifically, we divide the whole
population of users into four groups depending on their value of $\alpha_i$ and
plot the distribution for each group. Despite the exponential
cut-off, the results indicate some bursty patterns of activity for
short times. In addition, all distributions collapse onto a single
curve, suggesting a universal form for the burstiness in
tie activation/deactivation.

D. Linear growth of tie activation or deactivation

Although tie activation/deactivation events do not happen
homogeneously in time, the strong cut-off in the bursty inter-event
time distribution found in the previous section suggests that there
exists a typical time scale on which those events happen and
thus, for a large enough observation time, we should expect
linear growth of the accumulated number of events $n_{\alpha,i}(t)$
and $n_{\omega,i}(t)$. Indeed, by taking these time series and fitting
them to linear models we get the rates $\alpha_i$ and $\omega_i$ explained in
the main text. The statistical significance of the fit for
each individual's dynamics is shown in Fig. 10, where we can
see that the linear fit is statistically significant for a majority
of users with $n_{\alpha,i}(T) = 5$ and for most users with $n_{\alpha,i}(T) > 5$
(same results for $n_{\omega,i}$). On the other hand, for those selected
individuals for which the p-value is smaller than 0.05, the goodness of fit is on
average $R^2 \simeq 0.91$, with 93% of them having $R^2 > 0.8$. Thus,
the results presented in the following section and in the main
text for $\alpha_i$ and $\omega_i$ are only for those users with $n_{\alpha,i}(T) > 5$ (same
for $n_{\omega,i}(T)$), for which the goodness of fit is around $R^2 \simeq 0.91$
and the percentage of those with a p-value smaller than 0.05
is around 100%. They amount to 75% of the total number
of users.
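
A minimal sketch of such a fit is given below, assuming an ordinary least-squares line with intercept through the cumulative count evaluated at each event time. The exact regression set-up used for the paper (for instance, whether an intercept is included, or how the F-test is computed) is not specified here, so the function `fit_rate` is only an illustration.

```python
def fit_rate(event_times):
    """Least-squares fit of the cumulative number of events versus time.

    event_times: times (e.g. in days) at which a user activates (or
    deactivates) a tie. Returns (slope, intercept, r_squared), where the
    slope plays the role of the rate alpha_i (or omega_i).
    """
    ts = sorted(event_times)
    n = len(ts)
    if n < 2:
        raise ValueError("at least two events are needed for a linear fit")
    counts = range(1, n + 1)              # cumulative count after each event
    mean_t = sum(ts) / n
    mean_c = (n + 1) / 2
    sxx = sum((t - mean_t) ** 2 for t in ts)
    syy = sum((c - mean_c) ** 2 for c in counts)
    sxy = sum((t - mean_t) * (c - mean_c) for t, c in zip(ts, counts))
    if sxx == 0:
        raise ValueError("all events happen at the same time")
    slope = sxy / sxx
    intercept = mean_c - slope * mean_t
    r_squared = sxy * sxy / (sxx * syy)
    return slope, intercept, r_squared
```

For perfectly regular events, e.g. one tie activated every 10 days, the sketch returns a slope of 0.1 ties per day and $R^2 = 1$; bursty sequences give lower values of $R^2$.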



E. Statistical evidence for the conservation of social capacity

One of the key findings in our study is the fact that for a given
individual $i$ the rate at which ties are formed, $\alpha_i$, equals that
at which ties decay, $\omega_i$. This implies that social capacity, i.e.
the number of open connections at a given instant, is more or
less constant in time. In this section we describe the analysis
performed to reach this statement and the null model used
to assess the statistical significance of $\alpha_i \simeq \omega_i$. The basic
problem is the fact that for most of the users in our database
the numbers of events $n_{\alpha,i}$ and $n_{\omega,i}$ are very small, and thus we
get large differences between the values of $\alpha_i$ and $\omega_i$ obtained.

We test that our results are statistically comparable
to a null model in which ties are formed and destroyed in the
observation window $\Omega$ according to two different realizations
of a Poisson process with the same rate $\alpha = \omega$. The choice
of a Poisson process as the renewal process that describes
formation and decay is supported by the bounded
probability distribution for the inter-event times between
formation/decay events seen in the previous section. Of course
this is an approximation, because the probability of
bursts of formation/decay events is larger than predicted by the
exponential distribution of the Poisson process. The approximation
works better for long times or large numbers of events, since in that
limit the strong decay of the inter-event time distribution for
large values makes the process converge very quickly to the behavior of
a Poisson process by means of the Central Limit
Theorem [8].

Figure 10. Fraction of users for whom a linear fit to
the $n_{\alpha,i}(t) \sim \alpha_i t$ (red) and $n_{\omega,i}(t) \sim \omega_i t$ (blue) time series
has a p-value smaller than 0.05 for the F-test. Different
columns refer to different groups of users according to their
total number of activated/deactivated ties in the observation
period $\Omega$.



Since there is a large heterogeneity of social activity in our
database, we take as input for our null model the actual values
of $n_{\omega,i}$ to incorporate that heterogeneity. We
have also done simulations taking $n_{\alpha,i}$ and the results are the
same. Thus, our Monte Carlo simulations of the null model
are as follows: for every individual $i$ we take $\lambda_i = n_{\omega,i}/|\Omega|$
as the rate of tie formation and decay per day and
simulate two Poisson processes in the observation window $\Omega$
with the same rate, one for the formation of ties and the other
for the decay of ties. We then calculate the time series of the
aggregate number of events $n_{\alpha,i}(t)$ and $n_{\omega,i}(t)$ and fit them to
linear models to obtain the simulated $\alpha_i$ and $\omega_i$. In line with
the results of the previous section, we only consider for the fit
those simulations for which $n_{\alpha,i}(T) > 5$ and $n_{\omega,i}(T) > 5$.
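
The null model can be sketched as follows. This is a minimal illustration, not the simulation code used for the paper: the function names, the 210-day window length standing in for the 7 months of the observation window, and the number of runs are assumptions, and the linear fit is a plain least-squares slope of the cumulative count against the event times.

```python
import random


def poisson_event_times(rate, window, rng):
    """Event times of a homogeneous Poisson process on [0, window)."""
    times = []
    t = rng.expovariate(rate)
    while t < window:
        times.append(t)
        t += rng.expovariate(rate)
    return times


def slope(times):
    """Least-squares slope of the cumulative event count versus time."""
    n = len(times)
    mean_t = sum(times) / n
    mean_c = (n + 1) / 2
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (c - mean_c)
              for c, t in enumerate(times, start=1))
    return sxy / sxx


def simulate_null(n_events, window=210.0, runs=1000, min_events=5, seed=0):
    """Simulated (alpha, omega) pairs for one user of the null model.

    n_events: observed number of formed (or decayed) ties of the user,
    used to set the common rate n_events / window of both processes.
    Runs in which either process has min_events events or fewer are dropped.
    """
    rng = random.Random(seed)
    rate = n_events / window
    pairs = []
    for _ in range(runs):
        formed = poisson_event_times(rate, window, rng)
        decayed = poisson_event_times(rate, window, rng)
        if len(formed) > min_events and len(decayed) > min_events:
            pairs.append((slope(formed), slope(decayed)))
    return pairs
```

The spread of the simulated pairs around the diagonal gives a reference against which the measured differences between the two rates can be compared.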

As shown in Main Text Fig. 3 (see details in its caption),
the observed values of $n_{\alpha,i}(T)$ and $n_{\omega,i}(T)$ in our
database can be well explained by our simulations, suggesting
that our model works well at that particular time scale. We
also find good agreement between the measured values of
$\alpha_i$ and $\omega_i$ and the results of our null model, as shown in Main
Text Fig. 3, although there is a small number of outliers that
cannot be explained by our model.



F. Measuring neighborhood persistence 

We measured the persistence $p_i$ of a user $i$ as the fraction of his
neighbors present at the beginning of the observation period
$\Omega$ that are maintained until the end of $\Omega$. Specifically,
$p_i = |S_i(0) \cap S_i(T)|/|S_i(0)|$, where $S_i(0)$ and $S_i(T)$ are respectively
the sets of ties that user $i$ has at time $0$ and time $T$ (see the main
text). Once $p_i$ is measured for all users in our dataset,
we find that the average persistence $\langle p_i \rangle$ is 0.75. As discussed
in the main text, this suggests that although in a given time
period users activate and deactivate many connections (on
average half of their social connectivity), after a period of
7 months they maintain on average 75% of their initial
social network.

We also mentioned that this value is much larger than
the one obtained in a model in which each tie is activated
and deactivated with the same probability, suggesting that,
as expected, individuals do not establish or remove social
connections randomly. To address the latter, we simulated
the following process: for a given user we preserve (i) all
the properties of his measured social strategy ($k_i$, $\kappa_i$, $n_{\alpha,i}$,
$n_{\omega,i}$) and (ii) the real sequence of both his tie activation and
deactivation time instants. Thus, following the order of such
sequences, at each activation (deactivation) time we allow the
user to add (remove) one of his neighbors, randomly chosen
among all his neighbors. Note that in the random model we
maintain all the properties of the individual social network
and strategy, and the only thing that we destroy is the selection
of the neighbors added and/or removed within the observation
period $\Omega$. We then repeat this process for all users in our
dataset and for each of them we measure the new network
persistence $p_i$. As discussed in the main text, we found that
$\langle p_i \rangle = 50\%$ in the random model, against the $\langle p_i \rangle = 75\%$ measured for the real case,
which suggests that the way in which people activate, maintain
and deactivate social relationships is, as expected, not random
and that some ties are more likely to be destroyed than others.
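
The persistence measure and the randomized comparison can be sketched as below. This is a minimal illustration under stated assumptions: ties are identified by hashable, sortable neighbor labels, the user's event sequence is encoded as a list of `'+'` (activation) and `'-'` (deactivation) symbols in temporal order, and at each event the neighbor is drawn uniformly among those currently closed (for an activation) or currently open (for a deactivation). The function names are hypothetical.

```python
import random


def persistence(initial, final):
    """Fraction of the initial neighbors that are still present at the end."""
    initial, final = set(initial), set(final)
    if not initial:
        raise ValueError("persistence is undefined for an empty initial set")
    return len(initial & final) / len(initial)


def randomized_persistence(initial, all_neighbors, events, rng):
    """Persistence after replaying the user's event sequence at random.

    initial: neighbors open at the start of the window.
    all_neighbors: every neighbor of the user seen in the window.
    events: '+' / '-' symbols in the real temporal order of the user's
            tie activations and deactivations.
    """
    open_ties = set(initial)
    closed = set(all_neighbors) - open_ties
    for kind in events:
        if kind == '+' and closed:
            pick = rng.choice(sorted(closed))     # random neighbor to activate
            closed.remove(pick)
            open_ties.add(pick)
        elif kind == '-' and open_ties:
            pick = rng.choice(sorted(open_ties))  # random neighbor to deactivate
            open_ties.remove(pick)
            closed.add(pick)
    return persistence(initial, open_ties)
```

Averaging `randomized_persistence` over many replays and over all users gives the random baseline against which the measured persistence is compared.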

G. Relation of the social strategy with topological properties

We find a significant dependence between the social strategy
of an individual (encoded through the parameter $\gamma_i$) and the
topological properties around that individual. Specifically,
Fig. 11 shows that the persistence defined in the previous
section depends heavily on $\gamma_i$ but is largely independent
of the total connectivity of individuals in the period of
observation. Social keepers (those with $\gamma_i < 0.2$)
show a large persistence in their social neighborhood (even
up to 90%), while social explorers ($\gamma_i > 2$) only keep a small
fraction of their initial ties at the end of the 7-month period,
even down to 40%. On the other hand, the aggregated
clustering coefficient also depends on the social strategy: social
keepers tend to have more clustered neighborhoods than social
explorers. Specifically, we find that $c_i$ can be up to 0.22
for social keepers, while it decreases to 0.05 for social explorers.
Note that in the case of the clustering coefficient we also
observe that it decreases with increasing average connectivity,
an effect well known in social networks [7]: $c(k_i)$ is typically a
decreasing function of $k_i$, reflecting the fact that for highly
connected people it is increasingly difficult to maintain even a
moderate clustering. However, in Fig. 11 we see that the
clustering depends not only on the connectivity, but also on the
social strategy. Since both factors have opposite effects on
clustering we find, for example, that social keepers with large
connectivity might have the same clustering as social explorers
with small connectivity. Thus the aggregated clustering
found in social networks is a function of both connectivity
and social strategy, suggesting that its value is determined
dynamically by the tie formation/destruction processes around a
given individual.

Figure 11. Relation of the social strategy with topological properties: dependence of the average persistence of ties (A) and
aggregated clustering (B) on the total connectivity $k_i$ and social strategy $\gamma_i$. (C) Average value of the next-neighbor
connectivity $k_{nn,i}$ of a node as a function of its own connectivity $k_i$. The Pearson correlation coefficient between the two quantities
is $\rho(k_i, k_{nn,i}) = 0.342$ with a confidence range of [0.278, 0.316]. (D) Average value of the parameter $\gamma_{nn,i}$ for the neighbors of an
individual as a function of her own value of $\gamma_i$; $\rho(\gamma_i, \gamma_{nn,i}) = 0.412$ with confidence interval [0.394, 0.429]. A clear growth can be
seen in both cases, indicating a strong assortativity.

Finally, in our database we observe that social connectivity
is assortative, in line with other studies [7]. More interestingly,
we find that social strategies of communication are also
assortative, as shown in Fig. 11. As mentioned in the main
text, this result indicates that people who establish and remove
connections from their network at a high rate (social
explorers) are more likely to interact with people who also
change their network quickly. Analogously, those individuals
who maintain a more stable social network (social keepers)
also interact with people with the same strategy. As a
consequence, the large volatility observed in the neighborhood
of social explorers also extends to large portions of the
network around them, and the same applies to social keepers.
The global network thus consists of almost static zones of
social keepers and highly volatile clusters of social explorers, which,
as discussed in the main text, has important implications
in terms of information diffusion.
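
The assortativity measure used in this section, i.e. the Pearson correlation between the value of a quantity at a node (connectivity or social strategy) and its average over the node's neighbors, can be sketched as follows. The data layout (a dictionary of values and an adjacency dictionary) and the function names are assumptions made for this illustration.

```python
import math


def mean_neighbor_value(values, adjacency):
    """Average of `values` over the neighbors of each node with neighbors."""
    return {
        node: sum(values[n] for n in nbrs) / len(nbrs)
        for node, nbrs in adjacency.items()
        if nbrs
    }


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def assortativity(values, adjacency):
    """Correlation between a node's value and its neighbors' average value."""
    nn = mean_neighbor_value(values, adjacency)
    nodes = sorted(nn)
    return pearson([values[v] for v in nodes], [nn[v] for v in nodes])
```

Positive values indicate that nodes tend to be connected to nodes with similar values (assortative mixing), while a small chain of three nodes with the degree as value gives a negative coefficient.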



H. Facebook data set 

We have also analyzed other communication data to test our
results. In particular, we have studied the 90,269 users of
the New Orleans network crawled by Viswanath et al.
[4]. The data consist of communication events between users
through the Facebook wall from September 26th, 2006 to January
22nd, 2009. Contrary to the mobile phone data, the Facebook
data are not stationary in time, since the database extends over the
early days of Facebook growth and thus shows a growth in
activity over the years, which translates into more wall posts
and also more users as a function of time (see Fig. 12).

Figure 12. Activity in the Facebook database. Number of
communications through the wall in our database for periods
of 30 days. Dashed lines show the limits of the observation
time window $\Omega$.

To minimize this effect we have chosen only communication
events between users that showed some activity in the
observation window $\Omega$ (the time interval between 1000 and
1212 days in the database) and who were also present 20
days before and after $\Omega$. We do not require links to be
reciprocated, in order to have more data available for our
analysis. With this filter our database contains $125 \times 10^3$
communication events among $\sim 10^4$ users and $69 \times 10^3$ ties. On
average, users interact with $\langle k_i(T) \rangle = 6.15$ users in 7 months,
and the social activity is $\langle n_{\alpha,i}(T) \rangle = 3.01$ and $\langle n_{\omega,i}(T) \rangle = 3.02$
ties formed and decayed, respectively. Our results are very
similar to the ones observed for mobile phone data, namely
that social activity is roughly half of the social connectivity
in 7 months. However, users show a lower level of wall
activity: for example, 40% of the users are involved in fewer than
10 communication events through the wall in seven months
(while in the mobile phone data the average number of calls
exchanged per user was $\sim 700$ in seven months). Thus, to
determine the social dynamical strategies in the Facebook data
we concentrate on those users that show a moderate level of
communication, i.e. those that have more than 10 events in the
7 months of $\Omega$. For those users in our database we find that
$n_{\alpha,i}(T) \simeq n_{\omega,i}(T)$ and $\alpha_i \simeq \omega_i$, signaling that users in
Facebook also tend to conserve the number of open connections
$\kappa_i(t)$ in time (see Fig. 13 (a) and (b)). On average we find that
$\langle \kappa_i(t) \rangle = 3.23$. Finally, as in the mobile phone data, we
also find a relationship between the capacity and the activity of
users: in particular, 81% of the variance can be explained by the
relationship $n_{\alpha,i} = 1.047\,\kappa_i$ (see Fig. 13(c)).

In addition, as in the mobile phone network, we find a
large assortativity not only in the social connectivity but
also, and more importantly, in social dynamical strategies, i.e.
individuals with low $\gamma_i$ (social keepers) tend to gather in the
social network, while social explorers tend to interact among
themselves (see Fig. 14). Our results show that the dynamical
strategies of communication between users through the Facebook
wall follow the same pattern as in the mobile phone data.

Figure 13. Social dynamics in the Facebook database. (a) Relationship between the number of formed $n_{\alpha,i}$ and decayed $n_{\omega,i}$ ties in
the observation window for the users in our database. The box plot shows the 25% and 75% percentiles (filled box) and the 5% and
95% percentiles (whiskers); the solid black line is the relationship $n_{\alpha,i} = n_{\omega,i}$ and the blue curves correspond to the 5% and 95%
percentiles of the corresponding Poisson null model in SI section E for our data. (b) Density plot $\rho(\omega_i, \alpha_i)$ for the users with more
than 2 ties formed and decayed. The dashed line is $\alpha_i = \omega_i$ and the curves correspond to the contour lines $\rho = 0.03$ for the density of
the actual values of the rates (red) and of the ones obtained in the Poissonian model in SI section E (blue). (c) Log-density plot of the social
activity $n_{\alpha,i}$ and the social capacity $\kappa_i$. Dashed lines correspond to the iso-connectivity lines $k_i(T) = 10, 20, 50$ and the solid line is
the relationship $n_{\alpha,i} = 1.047\,\kappa_i$ obtained through PCA, which explains 81% of the variance.

Figure 14. Assortativity of connectivity and social strategy in the Facebook social network. (a) Average next-neighbor connectivity
$k_{nn,i}$ of a node as a function of her own connectivity $k_i$ for the $10^4$ users in the Facebook dataset. A clear growth can be seen, signaling
assortativity in this social network, with a Pearson correlation coefficient $\rho(k_i, k_{nn,i}) = 0.257$ with confidence range [0.238, 0.275].
(b) Average value of the parameter $\gamma_i$ for the neighbors of an individual as a function of her own value of $\gamma_i$. Similarly to $k_i$, we
observe a clear growth and a Pearson correlation coefficient $\rho(\gamma_i, \gamma_{nn,i}) = 0.197$ with confidence range [0.177, 0.217], which indicates
a strong assortativity of the social dynamical strategies in the Facebook database.



